ABSTRACT

n balls are randomly distributed into N cells, so that no cell may
contain more than one ball. This process is repeated m times. In addition,
balls may disappear; such disappearances are independent and identically
Bernoulli distributed. Conditions are given under which the number of empty
cells has an asymptotically standard normal distribution.
SIGNIFICANCE AND EXPLANATION


Some  asymptotic  properties  of  an  occupancy  model  which  includes  many 
classical  models  as  special  cases  are  studied. 


The responsibility for the wording and views expressed in this descriptive
summary lies with MRC, and not with the authors of this report.


THE  DISTRIBUTION  OF  THE  NUMBER  OF  EMPTY  CELLS  IN  A  GENERALIZED 
RANDOM  ALLOCATION  SCHEME* 

Bernard Harris, Morris Marden and C. J. Park

1. INTRODUCTION

The distribution of the number of empty cells in the following
random allocation process is considered. Let n, N be positive integers
with $n \le N$. Assume that n balls are randomly distributed into
N cells, so that no cell may contain more than one ball. Then, the
probability that each of n specified cells will be occupied is
$\binom{N}{n}^{-1}$. This process is repeated m times, so that there are
$\binom{N}{n}^{m}$ random allocations of nm balls among the N cells.
In addition, for each ball, let p, $0 \le p \le 1$, be the probability
that the ball will not "disappear" from the cell. The "disappearances"
are assumed to be stochastically independent for each ball; thus the
disappearances constitute a sequence of nm Bernoulli trials.

Several special cases of this problem have previously been considered.
In particular, p = 1, n = 1 is the classical occupancy problem, see
[2], [3], [10]. The case p = 1, n arbitrary has been discussed in [4]
and [7]. The case 0 < p < 1, n = 1 is treated in C. J. Park [5].

In this paper, we obtain the probability distribution and moments
of the number of empty cells. In section 3, we show that the number of
empty cells may be represented as a sum of independent Bernoulli random
variables. This representation permits us to determine conditions
on m, n, p, N such that the number of empty cells is asymptotically
normally distributed.

This  random  allocation  process  may  be  viewed  as  a  filing  or 
storage  process.  Objects  are  randomly  assigned  to  files  or  storage  bins. 
From  time  to  time,  objects  may  be  missing  or  have  disappeared. 




2. THE PROBABILITY DISTRIBUTION AND THE MOMENTS OF THE NUMBER
OF EMPTY CELLS


Let m, n, N be positive integers with $n \le N$. m sets, each consisting
of n balls, are distributed into N cells at random so that
no cell can contain more than one ball from the same set. As each set
is distributed, the balls that have been placed during the preceding
distributions are left in the cells. Thus, at the end of the process,
cells may contain as many as m balls. In addition, each ball may
"disappear" with common probability $1 - p$, $0 \le p \le 1$. These
disappearances are stochastically independent and thus constitute a
sequence of mn Bernoulli trials.
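
The allocation scheme just described can be simulated directly. The following is a minimal sketch, not part of the original report: the function names `simulate_empty_cells` and `mean_empty` are hypothetical, and the comparison value $N(1 - np/N)^m$ is the expression for E(S) obtained later in this section.

```python
import random

def simulate_empty_cells(m, n, N, p, rng):
    """Run the scheme once and return the number of empty cells.

    m sets of n balls are placed; within a set the n balls go to n
    distinct cells, and each ball independently remains with probability p.
    """
    occupied = [False] * N
    for _ in range(m):
        # rng.sample picks n distinct cells, so no cell gets two balls of one set
        for cell in rng.sample(range(N), n):
            if rng.random() < p:  # the ball does not disappear
                occupied[cell] = True
    return occupied.count(False)

def mean_empty(m, n, N, p, trials=20000, seed=0):
    """Monte Carlo estimate of the expected number of empty cells."""
    rng = random.Random(seed)
    total = sum(simulate_empty_cells(m, n, N, p, rng) for _ in range(trials))
    return total / trials
```

For example, `mean_empty(3, 4, 10, 0.5)` should lie close to $10(1 - 0.2)^3 = 5.12$, up to Monte Carlo error.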

Let $P_{m,n,N,p}(j)$ be the probability that exactly j of the N
cells are empty.

We  now  establish  the  following  theorem. 


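Theorem 1.

\[
P_{m,n,N,p}(j) = \binom{N}{n}^{-m}\binom{N}{j}\sum_{\ell=0}^{N-j}(-1)^{\ell}\binom{N-j}{\ell}\left[\sum_{i=0}^{j+\ell}(1-p)^{i}\binom{N-j-\ell}{n-i}\binom{j+\ell}{i}\right]^{m},
\qquad 0 \le j \le N. \tag{1}
\]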


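Proof. Let $A_v$ be the event that the vth cell is empty,
v = 1, 2, ..., N. Then,

\[
P(A_v) = \binom{N}{n}^{-m}\left[\sum_{i=0}^{1}(1-p)^{i}\binom{N-1}{n-i}\right]^{m}. \tag{2}
\]

For $1 \le v_1 < v_2 \le N$,

\[
P(A_{v_1}A_{v_2}) = \binom{N}{n}^{-m}\left[\sum_{i=0}^{2}(1-p)^{i}\binom{N-2}{n-i}\binom{2}{i}\right]^{m}. \tag{3}
\]

Thus, for $1 \le v_1 < v_2 < \cdots < v_r \le N$,

\[
P(A_{v_1}A_{v_2}\cdots A_{v_r}) = \binom{N}{n}^{-m}\left[\sum_{i=0}^{r}(1-p)^{i}\binom{N-r}{n-i}\binom{r}{i}\right]^{m}. \tag{4}
\]

Thus, using the inclusion-exclusion method, the probability that exactly
j cells are empty is

\[
P_{m,n,N,p}(j) = \binom{N}{n}^{-m}\sum_{r=j}^{N}(-1)^{r-j}\binom{r}{j}\binom{N}{r}\left[\sum_{i=0}^{r}(1-p)^{i}\binom{N-r}{n-i}\binom{r}{i}\right]^{m}. \tag{5}
\]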


We can write (5) in the form (1) by letting $r = j + \ell$.
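
The inclusion-exclusion argument can be evaluated exactly for small parameters. The sketch below is illustrative only: it assumes that the probability that r specified cells are all empty is $\binom{N}{n}^{-m}[\sum_i (1-p)^i\binom{N-r}{n-i}\binom{r}{i}]^m$, which is our reading of the garbled displays, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def joint_empty_prob(r, m, n, N, p):
    """Probability that r specified cells are all empty after m sets."""
    p = Fraction(p)
    # i of the r cells receive a ball of the set, and all i balls disappear
    inner = sum((1 - p) ** i * comb(N - r, n - i) * comb(r, i)
                for i in range(min(r, n) + 1))
    return (Fraction(inner) / comb(N, n)) ** m

def empty_cell_distribution(m, n, N, p):
    """Return [P(S = 0), ..., P(S = N)] by inclusion-exclusion."""
    dist = []
    for j in range(N + 1):
        total = sum((-1) ** (r - j) * comb(r, j) * comb(N, r)
                    * joint_empty_prob(r, m, n, N, p)
                    for r in range(j, N + 1))
        dist.append(total)
    return dist
```

Exact rational arithmetic is used so that the alternating sum does not suffer from cancellation error; for m = 1 and p = 1 the whole mass falls on j = N - n, as it must.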

We now determine the factorial moments of S, the number of
empty cells.

Theorem 2. The vth factorial moment of S is

\[
E(S^{(v)}) = \binom{N}{n}^{-m} N^{(v)}\left[\sum_{j=0}^{v}(1-p)^{j}\binom{N-v}{n-j}\binom{v}{j}\right]^{m}. \tag{6}
\]




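Proof. From J. Riordan [9], p. 53, and from (4), it follows immediately
that

\[
E(S^{(v)}) = v!\binom{N}{v}\binom{N}{n}^{-m}\left[\sum_{j=0}^{v}(1-p)^{j}\binom{N-v}{n-j}\binom{v}{j}\right]^{m}. \tag{7}
\]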

We  thus  obtain  the  following. 

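Corollary.

\[
E(S) = N\left(1 - \frac{np}{N}\right)^{m}, \tag{8}
\]

\[
\sigma_S^2 = N(N-1)\left\{1 - \frac{2np}{N} + \frac{n(n-1)p^2}{N(N-1)}\right\}^{m}
+ N\left(1 - \frac{np}{N}\right)^{m}\left\{1 - N\left(1 - \frac{np}{N}\right)^{m}\right\}. \tag{9}
\]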


Proof. From (7),

\[
E(S) = N\binom{N}{n}^{-m}\left(\binom{N-1}{n} + \binom{N-1}{n-1}(1-p)\right)^{m} = N\left(1 - \frac{np}{N}\right)^{m}.
\]

Since

\[
\sigma_S^2 = E(S^{(2)}) + E(S) - (E(S))^2,
\]

the conclusion follows readily from (6), after some elementary
calculations.
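
The "elementary calculations" can be checked numerically. The sketch below assumes the reading of (6) as $E(S^{(v)}) = \binom{N}{n}^{-m} N^{(v)} [\sum_j (1-p)^j \binom{N-v}{n-j}\binom{v}{j}]^m$ with $N^{(v)}$ the falling factorial; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, perm

def factorial_moment(v, m, n, N, p):
    """v-th factorial moment of the number of empty cells, as read from (6)."""
    if v > N:
        return Fraction(0)
    p = Fraction(p)
    inner = sum((1 - p) ** j * comb(N - v, n - j) * comb(v, j)
                for j in range(min(v, n) + 1))
    # perm(N, v) is the falling factorial N(N-1)...(N-v+1)
    return perm(N, v) * (Fraction(inner) / comb(N, n)) ** m

def mean_and_variance(m, n, N, p):
    """E(S) and Var(S) from the first two factorial moments."""
    e1 = factorial_moment(1, m, n, N, p)
    e2 = factorial_moment(2, m, n, N, p)
    return e1, e2 + e1 - e1 ** 2
```

For m = 1 the number of empty cells is N - n plus a binomial count of disappeared balls, so the variance should reduce to np(1 - p); this gives a quick sanity check of the formulas.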
For some purposes, the following equivalent forms of (9) will
prove useful.


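From Theorem 2, we readily obtain the following.

Theorem 3. The factorial moment generating function of S is given by

\[
\phi_m(t) = E(1+t)^{S} = \sum_{r=0}^{N}\binom{N}{r}t^{r}\binom{N}{n}^{-m}\left(\sum_{j=0}^{r}(1-p)^{j}\binom{N-r}{n-j}\binom{r}{j}\right)^{m}. \tag{12}
\]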

Note that $\phi_m(t)$ is a polynomial in t of degree N. This fact is
exploited in the next section, where the asymptotic distribution of
S is obtained. In particular,

\[
\phi_0(t) = (1+t)^{N} \tag{13}
\]

and

\[
\phi_1(t) = (1+t)^{N-n}(1+(1-p)t)^{n}. \tag{14}
\]


We now investigate the asymptotic distribution properties of the
number of empty cells.


3.  THE  ASYMPTOTIC  DISTRIBUTION  OF  THE  NUMBER  OF  EMPTY  CELLS 

In this section, we determine conditions under which the number of
empty cells (when suitably normalized) has an asymptotically normal
distribution. In order to establish this, a number of preliminary results are
required.

Lemma 1. Let N, n, r be non-negative integers, r < n < N. Then


Proof. Since $\binom{v}{a} = 0$ whenever v < a, we can write


To obtain the conclusion, note that

\[
\sum_{x=0}^{r}\binom{x}{a}\binom{r}{x}\binom{N-r}{n-x}\Big/\binom{N}{n} = E\{X^{(a)}\}/a!,
\]

where X has the hypergeometric distribution. From B. Harris [1],
p. 105,

\[
\sum_{x=0}^{r}x^{(a)}\binom{r}{x}\binom{N-r}{n-x}\Big/\binom{N}{n} = \frac{r^{(a)}n^{(a)}}{N^{(a)}}.
\]

The conclusion follows immediately.


Lemma 2.

\[
\sum_{j=0}^{n}\frac{(-1)^{j}\binom{n}{j}r^{(j)}p^{j}}{N^{(j)}}
= \frac{\sum_{v=0}^{r}(1-p)^{v}\binom{N-r}{n-v}\binom{r}{v}}{\binom{N}{n}}. \tag{16}
\]

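Proof. The right-hand side of (16) may be written

\[
\sum_{v=0}^{r}\frac{\binom{N-r}{n-v}\binom{r}{v}}{\binom{N}{n}}\sum_{j=0}^{v}\binom{v}{j}(-1)^{j}p^{j}
= \sum_{j=0}^{r}(-1)^{j}p^{j}\sum_{v=j}^{r}\binom{v}{j}\frac{\binom{N-r}{n-v}\binom{r}{v}}{\binom{N}{n}}.
\]

Thus, the coefficient of $p^{j}$ is

\[
(-1)^{j}\sum_{v=j}^{r}\binom{v}{j}\binom{N-r}{n-v}\binom{r}{v}\Big/\binom{N}{n}.
\]

From Lemma 1,

\[
\sum_{v=j}^{r}\binom{v}{j}\binom{N-r}{n-v}\binom{r}{v}\Big/\binom{N}{n}
= \frac{r^{(j)}n^{(j)}}{j!\,N^{(j)}} = \binom{n}{j}\frac{r^{(j)}}{N^{(j)}},
\]

from which the conclusion follows immediately. Employing the above
lemmas, we can now establish the following theorem.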
and


00  whenever 


mnp 


The conclusion is obvious whenever r > 0.

If $\alpha \to \infty$ as $N \to \infty$, then

\[
\sigma_S^2 = Ne^{-\alpha} + O(Ne^{-2\alpha})
\]


and 


.3/2 

->  oo  whenever  3a  -  log  N  ■+ 
N 
Proof. From (11), we can write, for $\alpha \to 0$,

\[
\sigma_S^2 = N e^{-\alpha}(1 - e^{-\alpha} - \alpha p e^{-\alpha}) + O(np\alpha) + O(p^2\alpha^2),
\]

where $\alpha = mnp/N$. Then, as $\alpha \to 0$,

\[
\sigma_S^2 = N(1 - \alpha + \alpha^2/2)(\alpha - \alpha^2/2 - \alpha p + \alpha^2 p) + O(N\alpha^3) + O(mn\alpha).
\]

Then, if $p \to p^* \ne 1$,

\[
\sigma_S^2 = N\alpha(1-p) + O(N\alpha^2)
\]

and


then

\[
\sigma_S^2 = N\alpha(1-p) + o(N\alpha(1-p)).
\]
Theorem 6. $V = (S - E(S))/\sigma_S$ has an asymptotically standard normal
distribution as $N \to \infty$ whenever any of the following conditions is
satisfied.


1.  Sp-0.  P  +  P*  *  1  and  ^ 


2.  -*■  0,  O-p)  +0  so  that  for  some  c  >  0, 


(1-p)  *  c(^j-p)  +  o((^^)  ),  0  <  p  <  1,  and 


mnp 


3.  0  >  (1-p)  *  +  o{(S£)  ),  pal,  and 


mnp 

7* 


4#  !™£  r  >  0  ; 


3mpn 


5.  -*■  °°  and  ^ 


log  N 
From (29),

\[
\log\phi_m(t) = nm\log(1-p) + \sum_{i=1}^{N}\log(t - t_i^{(m)})
= \sum_{i=1}^{N}\log(1+\tau_i t)
= \sum_{i=1}^{N}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}(\tau_i t)^{k}}{k}.
\]

Thus, 


and 


Then 


0  <  s  1  , 


sM/v. 


'j Bj.s  KD}'  s  c£  n* 


since the Stirling numbers of the second kind do not depend on N, n, m, or p.
We now establish the following theorem.


(30) 
\[
f(x) = c(x-x_1)(x-x_2)\cdots(x-x_N), \qquad x_1 \le x_2 \le \cdots \le x_N,
\]

the representation follows by setting $\tau_i = -(t_i^{(m)})^{-1}$ and noting
that $\phi_m(0) = 1$.

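Let $\kappa_v = \kappa_v(n,N,m,p)$ be the cumulants of S and let $\kappa_{[v]}$
be the factorial cumulants of S. That is,

\[
\log\phi_m(t) = \sum_{v=1}^{\infty}\kappa_{[v]}t^{v}/v!.
\]

Then

\[
\kappa_v = \sum_{j=1}^{v}\mathfrak{S}_{j,v}\,\kappa_{[j]},
\]

where $\mathfrak{S}_{j,v}$ are the Stirling numbers of the second kind.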

Then, as $N \to \infty$,

\[
V = (S - E(S))/\sigma_S
\]

is asymptotically distributed by the standard normal distribution
(mean 0, variance unity), whenever

\[
\kappa_v/\sigma_S^{v} \to 0, \qquad v > 2.
\]
representation of the form


The zeros of $\phi_0(t)$ are $t_1^{(0)} = t_2^{(0)} = \cdots = t_N^{(0)} = -1$. The zeros
of $\phi_1(t)$ are $t_1^{(1)} = -1, \ldots, t_{N-n}^{(1)} = -1$, $t_{N-n+1}^{(1)} = -1/(1-p), \ldots,
t_N^{(1)} = -1/(1-p)$. Now apply Lemma 3 with $\psi(z) = \phi_1(z)$, obtaining
a = 1, $b = (1-p)^{-1}$. Then, the zeros of $\phi_2(t)$ are real and satisfy

\[
-(1-p)^{-2} \le t_j^{(2)} \le -1, \qquad j = 1,2,\ldots,N.
\]

It then follows readily by induction that the zeros of $\phi_k(t)$ are
real and satisfy

\[
-(1-p)^{-k} \le t_j^{(k)} \le -1, \qquad j = 1,2,\ldots,N, \quad k = 2,3,\ldots.
\]

Theorem 5. For $1 \le n \le N$, $0 \le p \le 1$, $m \ge 1$, S has a
representation as the sum of N mutually independent Bernoulli random
variables. That is, there exist mutually independent Bernoulli random
variables, $Y_j = Y_j(N,m,p,n)$, j = 1, 2, ..., N, such that

\[
S = \sum_{j=1}^{N} Y_j,
\]

\[
P\{Y_j = 1\} = \tau_j = 1 - P\{Y_j = 0\}.
\]
\[
B_p = \bigcap_{-\infty<\gamma<\infty} B_{p,\gamma} = \{z : z \text{ real},\ -b(1-p)^{-1} \le x \le -a(1-p)^{-1}\}. \tag{25}
\]

Consequently, $C \cup B_p$ is contained in the interval (21), proving
the lemma.

We  now  establish  the  following  theorem. 

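Theorem 4. Let

\[
\phi_m(t) = \sum_{r=0}^{N}\binom{N}{r}t^{r}\binom{N}{n}^{-m}\left(\sum_{j=0}^{r}(1-p)^{j}\binom{N-r}{n-j}\binom{r}{j}\right)^{m}.
\]

Let $t_1^{(m)}, t_2^{(m)}, \ldots, t_N^{(m)}$ be the zeros (not necessarily distinct) of
$\phi_m(t)$. Then $t_j^{(m)}$, j = 1, 2, ..., N, are all real and

\[
-(1-p)^{-m} \le t_j^{(m)} \le -1, \qquad j = 1,2,\ldots,N; \quad m = 0,1,\ldots.
\]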

Proof. From (19),

\[
\phi_{m+1}(t) = T(\phi_m(t)), \qquad m = 0,1,\ldots,
\]

and from (13),

\[
\phi_0(t) = (1+t)^{N}.
\]
That is, $\phi(z^*) = T(\psi(z^*))$ is a linear symmetric
function of $z_1^{(0)}, z_2^{(0)}, \ldots, z_N^{(0)}$. Thus, the conditions of Walsh's
theorem (M. Marden [5], p. 62) are satisfied. Thus, if
$z_1^{(0)}, z_2^{(0)}, \ldots, z_N^{(0)}$ are points of $C_\gamma$, then there is at least one
point $\zeta$ in $C_\gamma$ such that

\[
T[(z^*-\zeta)^{N}] = 0,
\]

that is, one can set $z_1^{(0)} = z_2^{(0)} = \cdots = z_N^{(0)} = \zeta$ and preserve the
value 0. From (18),

\[
T[(z^*-\zeta)^{N}] = (z^*-\zeta)^{N-n}(z^*-\zeta-pz^*)^{n} = 0.
\]

Thus either $z^* = \zeta$ and therefore $z^*$ is in $C_\gamma$, or $z^* = \zeta(1-p)^{-1}$
and $z^*$ is in

\[
B_{p,\gamma} = \{z : |z + (c - i\gamma)(1-p)^{-1}| \le [(c-a)^2 + \gamma^2]^{1/2}(1-p)^{-1}\}. \tag{23}
\]

However, $\gamma$ is real and arbitrary. Hence it is clear that

\[
C = \bigcap_{-\infty<\gamma<\infty} C_\gamma = \{z : z \text{ real},\ -b \le x \le -a\} \tag{24}
\]

and
and

\[
\phi(z) = T(\psi(z)) = c_1\prod_{\alpha=1}^{N}(z - z_\alpha'). \tag{20}
\]

If the zeros of $\psi(z)$ are real and satisfy

\[
-b \le x_\alpha \le -a, \qquad a, b \ge 0,
\]

then the zeros of $\phi(z)$ are real and satisfy

\[
-\frac{b}{1-p} \le x_\alpha' \le -a. \tag{21}
\]

Proof. Let

\[
C_\gamma = \{z : |z + (c - i\gamma)| \le [(c-a)^2 + \gamma^2]^{1/2},\ c = \tfrac{1}{2}(a+b)\}. \tag{22}
\]

Clearly -a and -b are on the boundary of the circular region $C_\gamma$.
Consequently all zeros of $\psi(z)$ are in $C_\gamma$. Let $z^*$ be a zero of
$\phi(z)$. Let

c^-zi'W-z^) 


(Z*-Z 


(1) 

N 


). 
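\[
\left(\sum_{j=0}^{n}\frac{(-1)^{j}\binom{n}{j}(pt)^{j}D^{j}}{N^{(j)}}\right)
\sum_{r=0}^{N}\binom{N}{r}t^{r}\left(\frac{\sum_{a=0}^{r}(1-p)^{a}\binom{N-r}{n-a}\binom{r}{a}}{\binom{N}{n}}\right)^{k}
\]

\[
= \sum_{r=0}^{N}\binom{N}{r}t^{r}\left(\sum_{j=0}^{n}\frac{(-1)^{j}\binom{n}{j}r^{(j)}p^{j}}{N^{(j)}}\right)
\left(\frac{\sum_{a=0}^{r}(1-p)^{a}\binom{N-r}{n-a}\binom{r}{a}}{\binom{N}{n}}\right)^{k}.
\]

The conclusion now follows from Lemma 2.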
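Let

\[
T(f(t)) = \left(\sum_{j=0}^{n}\frac{(-1)^{j}\binom{n}{j}(pt)^{j}D^{j}}{N^{(j)}}\right)f(t), \qquad 0 < p < 1. \tag{18}
\]

Then, from Theorem 3, we have that

\[
\phi_{m+1}(t) = T(\phi_m(t)), \qquad \phi_0(t) = (1+t)^{N}. \tag{19}
\]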
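
The recursion (19) is easy to carry out on polynomial coefficients, because $t^j D^j t^r = r^{(j)} t^r$ and so T merely rescales each monomial. The following sketch assumes that T has the form $\sum_{j=0}^{n} (-1)^j \binom{n}{j} (pt)^j D^j / N^{(j)}$ with $D = d/dt$, which is our reading of (17) and (18); the function names `apply_T` and `fmgf` are hypothetical.

```python
from fractions import Fraction
from math import comb, perm

def apply_T(coeffs, n, N, p):
    """Apply T to the polynomial sum_r coeffs[r] * t**r (degree at most N)."""
    p = Fraction(p)
    out = []
    for r, c in enumerate(coeffs):
        # t^j D^j t^r = r^(j) t^r, so t^r is multiplied by a scalar factor
        factor = sum((-1) ** j * comb(n, j) * p ** j * perm(r, j) / perm(N, j)
                     for j in range(n + 1))
        out.append(c * factor)
    return out

def fmgf(m, n, N, p):
    """Coefficients of phi_m(t), starting from phi_0(t) = (1 + t)**N."""
    coeffs = [Fraction(comb(N, r)) for r in range(N + 1)]
    for _ in range(m):
        coeffs = apply_T(coeffs, n, N, p)
    return coeffs
```

With this representation, `fmgf(1, n, N, p)` reproduces the coefficients of $(1+t)^{N-n}(1+(1-p)t)^n$, and the coefficient of t in `fmgf(m, n, N, p)` equals $N(1 - np/N)^m$.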


Lemma 3. Extend the domain of T to the complex plane, letting
z = x + iy; x, y real. Let

\[
\psi(z) = \prod_{\alpha=1}^{N}(z - z_\alpha)
\]



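Theorem 3. The factorial moment generating function of the number
of empty cells $\phi_m(t)$ (12) satisfies the following
differential-difference equation.

\[
\phi_{m+1}(t) = \left(\sum_{j=0}^{n}\frac{(-1)^{j}\binom{n}{j}(pt)^{j}D^{j}}{N^{(j)}}\right)\phi_m(t), \qquad m = 0,1,\ldots, \tag{17}
\]

where $D^{j} = d^{j}/dt^{j}$.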


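Proof. For m = 0, $\phi_0(t) = (1+t)^{N}$; hence

\[
\left(\sum_{j=0}^{n}\frac{(-1)^{j}\binom{n}{j}(pt)^{j}D^{j}}{N^{(j)}}\right)(1+t)^{N}
= \sum_{j=0}^{n}\frac{(-1)^{j}\binom{n}{j}(pt)^{j}N^{(j)}}{N^{(j)}}\left[(1+t)^{N-n}(1+t)^{n-j}\right]
\]

\[
= (1+t)^{N-n}\sum_{j=0}^{n}(-1)^{j}\binom{n}{j}(pt)^{j}(1+t)^{n-j}
\]

\[
= (1+t)^{N-n}(1+t-pt)^{n},
\]

in agreement with (14).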

Assume that (17) holds for m = 1, 2, ..., k. Then, from (12),




